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Abstract 

We  compare  three  methods  for  refining  estimates  of  invariant  subspaces,  due  to  Chatelin, 
Dongarra/MolerAVilkinson,  and  Stewart.  Even  though  these  methods  all  apparently  solve  dif- 
ferent equations,  we  show  by  changing  variables  that  they  all  solve  the  same  equation,  the 
Riccati  equation.  The  benefit  of  this  point  of  view  is  threefold.  First,  the  same  convergence 
theory  applies  to  all  three  methods,  yielding  a  single  criterion  under  which  the  last  two 
methods  converge  linearly,  and  a  slightly  stronger  criterion  imder  which  the  first  algorithm 
converges  quadratically.  Second,  it  suggest  a  hybrid  algorithm  combining  advantages  of  all 
three.  Third,  it  leads  to  algorithms  (and  convergence  criteria)  for  the  generalized  eigenvalue 
problem.  These  techniques  are  compared  to  techniques  used  in  the  control  systems  commun- 
ity. 


^ri 


1.  iDtrodDction. 

Methods  for  refining  estimates  of  invariant  subspaces  of  matrices  have  been  suggested 
in  [Chatelin],  [Dongarra,Moler,Wilkinson],  and  [Stewart].  These  three  methods  (henceforth 
called  C,  DMW,  and  S,  respectively),  all  solve  apparently  different  equations,  since  they 
represent  the  desired  invariant  subspace  slightly  differently.  However,  by  a  simple  diange  of 
basis,  we  will  see  that  all  three  methods  are  attempting  to  solve  the  same  equation,  the  Ric- 
cati  equation: 

AR  -  RB  =  C  +  RDR 

for  R,  which  represents  the  error  in  the  initial  estimate  of  the  invariant  subspace. 

The  benefit  of  this  unified  point  of  view  is  threefold.  First,  it  allows  the  same  conver- 
gence theory  to  be  applied  to  £ill  three  methods,  yielding  the  same  criterion  for  linear  conver- 
gence of  DMW  and  S,  and  a  slightly  stronger  criterion  for  quadratic  convergence  of  C. 
Second,  it  suggests  a  hybrid  algorithm  (Algorithm  3  below)  combining  advantages  of  DWM, 
C  and  S.  Third,  it  suggests  analogous  algorithms  (and  convergence  theory)  for  the  general- 
ized eigenproblem. 

The  Riccati  equation  (and  its  variations)  have  been  a  central  object  of  study  in  the  con- 
trol systems  commxmity  for  some  time,  and  a  nimiber  of  algorithms  have  been  proposed  and 
implemented  [Arnold, Laub].  The  algorithms  used  there  mirror  the  ones  in  the  numerical 
analysis  community:  converting  the  Riccati  equation  to  an  eigenvalue  problem  and  using  the 
QR  or  QZ  algorithms,  followed  by  Newton  iteration  to  refine  the  solution  if  necessary. 

This  paper  is  organized  as  follows.  After  introducing  some  notation  in  section  2,  we 
show  how  the  methods  DWM,  C  and  S  all  reduce  to  solving  the  Riccati  equation  in  section  3. 
Section  4  discusses  a  linear  iteration  and  Newton  iteration  for  solving  the  Riccati  equation, 
and  gives  convergence  criteria.  Section  5  presents  three  generalizations  of  DWM,  C  and  S  for 
solving  the  generalized  eigenproblem,  and  shows  how  they  may  be  again  reduced  to  solving  a 
generalized  Riccati  equation.  Section  6  generalizes  the  results  of  section  4  to  solving  this 
generalized  equation.  Section  7  discusses  the  relative  costs  of  the  linear  iteration  and  Newton, 
and  suggests  a  hybrid  algorithm  which  may  be  seen  as  a  form  of  modified  Newton.  Section  8 
discusses  future  work;  a  numerical  comparison  of  DWM,  C  and  the  hybrid  algorithm 
presented  below. 

2.  Notation 

The  n  by  n  matrix  whose  invariant  subspace  we  desire  will  be  called  A .  The  invariant 
subspace  we  seek  will  be  m  dimensional.  We  will  measure  errors  using  the  2-norm  for 
matrices 

||r||-su^||rx||/|MI 

(||j:||  denotes  the  2-norm  for  vectors)  and  the  Frobenius  norm 

\\T\\r  -  (2  \T^j?)'^  . 
U 

3.  Redaction  to  the  Riccati  Equation 

We  first  consider  the  method  C  of  Chatelin.  She  seeks  an  n  by  m  matrix  X  whose 
columns  span  the  desired  invariant  subspace.  Her  system  of  equations  depends  on  a  fixed  full 
rank  «  by  m  matrix  Z: 

AX  -  XiZ*AX)  =  0        ,       Z*X  =  I  (C) 

Now  change  basis  so  that  (in  the  new  basis)  Z=[/|0]'".  If  Z  consists  of  orthonormal  columns 


(and  there  is  no  reason  it  should  not),  then  this  change  of  basis  can  even  be  orthonormal.  We 
will  also  call  the  new  transformed  matrix  A,  and  write  it  as  the  partitioned  matrix 


A  = 


A 


12 


A21  A22 


where  A^^  is  m  sub  m  and  A22  is  n-m  by  n-m.  In  this  coordinate  system  it  is  easy  to  see 
Z*X=I  implies  X  is  of  the  form 

7 

X  = 

where  /?  is  an  arbitrary  n-m  by  m  matrix.  This  lets  us  rewrite  (Q  as 

A21  +  A22R  -  /?(Au  +  A^R)  =  0 


or 


A22R  -  RAii  =  -A21  +  RAJt 


(C) 


the  promised  Riccati  e^juation. 

In  the  DMW  method  Dcngarra,  Moler  and  Wilkinson  try  to  solve  the  equation 

AX  =  XB  pWM) 

simultaneously  for  the  n  by  m  matrix  X  and  the  m  by  m  matrix  B.  Since  this  is  rm  equations 
in  nm  +  m?  unknowris,  X  is  constrained  by  holding  m  of  its  rows  fixed,  reducing  the  number 
of  unknowns  to  nm .  The  m  rows  of  X  to  hold  fixed  are  chosen  to  be  the  "most  nonsingular" 
m  by  m  submatrix  of  the  initial  approximation  to  X.  In  practice,  these  may  be  determined  by 
LU  with  pivoting.  If  the  initial  approximation  of  X  has  orthononnal  columns,  by  an  ortho- 
normal  change  of  basis  we  may  assume  that  it  consists  of  the  first  m  columns  of  the  identity 
matrix.   Thus  in  this  new  basis  X  may  be  written 

7 
R 

with  R  arbitrary  as  before,  allowing  us  to  rewrite  (DWM)  as 


A^i  +  A^/? 
A21  +  A227? 


B 
RB 


Substituting  the  first  equation  for  B  into  the  second  yields 

A22R  ""  ^"11  ^  """21       RA^^R    y 


(DWM') 


the  same  Riccati  equation  as  before. 

Finally  we  consider  Stewart's  method  S.  Artually,  S  was  originally  presented  not  as  an 
algorithm  but  as  a  technique  for  doing  perturbation  theory  for  invariant  subspaces.  Nonethe- 
less, it  works  as  an  algorithm  as  well,  and  the  perturbation  theory  derived  by  Stewart  will 
later  be  used  to  derive  convergence  criteria  for  all  three  algorithms.  S  begins  with  an  n  by  m 
matrix  X  spanning  an  approximate  invariant  subspace  and  a  n  by  n-m  matrix  X'  spanning  a 
complementary  subspace,  so  that  the  matrix  {X\X'\  is  nonsingular.  The  true  invariant  sub- 
space  is  represented  as  X  -1-  X'R,  where  /?  is  n-m  by  m  as  before,  and  an  equation  is  derived 
for  R  as  follows.  X+X'R  will  be  invariant  if  and  only  if  the  lower  left  n-m  by  m  block  of 

X  1 


[X+X'R\K']-^  ■  A  ■  [X-¥X'R\X']  =  (  [Xpf']- 


)-^  ■  A  ■  {[X\X-]- 


) 


is  zero.    As  before,  we  change  basis  so  that  [Xpf']  is  the  identity  matrix.  If  X  consists  of 


orthonormal  columns  and  X'  spans  the  orthogonal  complement  of  the  space  spanned  by  X, 
this  change  of  basis  can  be  orthononnal.  In  this  new  basis  we  get  that  the  lower  left  n-m  by 
m  comer  of 


7  0 

R   I 

-1 

-^n  ^12 

^21   ^^22 

• 

/  0 

R  I 

= 

I     0 
-R  I 

■ 

^11    ^12 
•^21   ^22 

• 

/   0 
R  I 

must  be  zero,  or 

A22"       "^11  ^  ^21       RA^R    , 
the  same  Riccati  equation  as  before. 


(S) 


4.  Solving  the  Rkcsti  E^aaticn 

There  are  two  basic  approaches  to  solving  the  Riccati  equation,  the  iteration 

^22^/+i  -  ^(+1^11  =  -^2i  +  Mu'?/      .      *o=0  (Iter) 

and  Newton's  method.  DWM  and  S  use  iteration,  and  C  uses  Newton.  It  turns  out  that 
Stewart's  perturbation  analysis  yields  a  useful  convergence  criterion  for  (Iter)  in  terms  of 
simply  computable  properties  of  the  blocks  Ay.  Stewart's  analysis  appears  to  be  repeated  in 
part  in  the  later  paper  by  Dongarra,  Moler,  Wilkinson.  (Since  Dongarra,  Moler  and  Wilkin- 
son actually  solve  (DW^^)  by  a  Gauss-Seidel  type  of  iteration  where  newly  updated  com- 
ponents of  X  and  B  are  used  to  update  later  components,  their  analysis  is  slightly  different.) 
A  very  similar  analysis  yields  almost  the  same  criterion  for  (C)  to  have  quadratic  conver- 
gence. This  quadratic  convergence  criterion  does  not  appear  in  Cbatelin's  paper  and  seems  to 
be  stronger  than  her  results. 

To  state  the  results  we  need  to  define  the  separation  of  two  matrices  iep(A^^,A22) 
[Stewart]:  it  is  the  smallest  singular  value  of  the  linear  operator  which  maps  R  to 
^22^  -  R^ii-  It  is  nonzero  (i.e.  the  operator  is  invertible)  as  long  as  A^  and  A22  have  dis- 
joint spectra,  and  also  satisfies  sep(A,fl)  =  sep(fl,i4).  One  can  also  show  that  it  is  the  smal- 
lest singular  value  of  the  m(n  —  m)  by  m (n-m)  matrix 

-   AT.t 


Theorem  1:  [Stewart]  Let 


Then  if 


I„<^A22    -    A[;®/, 


\\An\\F  ■  \\A2:\\f 


sep2(Aii,A22) 


K   < 


1 


^  (LinearConvergence) 

(Iter)  converges  linearly  to  a  solution  R  which  is  the  unique  solution  inside  the  ball 

llRll    ^   1  -  (1-4k)^'^     _!lliil!f__<2  ll^2illir 

^  2k  scp(Aii^22)  '  »ep(A  11^22)    " 

Furthermore,  the  convergence  is  linear  with  a  contraction  constant  of  no  more  than 

1  -  (1-4k)^'2   . 

The  parameter  k  can  be  interpreted  as  fellows.  Its  numerator,  ||Ai2l|jr  •  ||A2il|jr,  meas- 
ures the  quality  of  the  initial  approximate  invariant  subspace:  it  will  be  small  when  the 
approximation  is  good,  and  the  factor  |lA2il!/r  will  be  zero  if  and  only  if  the  initial  approxima- 
tion is  in  fact  correct.    llA^IU  will  be  small  if  X'  (in  S)  is  also  a  good  approximate  invariant 


subspace.  The  denominator  sep  measures  the  separation  of  the  spectra  of  A^^  and  Aji-  If  sep 
is  small  it  means  some  eigenvalues  of  A^;  and  A22  can  be  made  to  merge  with  small  changes 
in  both  An;  this  means  the  invariant  subspaces  belonging  to  the  two  parts  of  the  spectrum  are 
unstable  and  hard  to  compute.  Thus,  k  will  be  small  if  we  start  with  a  good  initial  approxi- 
mate invariant  subspace  and  if  the  eigenvalues  associated  with  that  subspace  are  well 
separated  from  the  remainder  of  the  spectrum. 

Now  we  turn  to  the  quadratic  convergence  of  C.  As  in  [Chatelin],  it  is  easy  to  see  that 
Jacobean  of  F(X)  ^  AX  -  X(Z*AX)  with  respect  to  X  is  the  linear  map  which  maps  S  to 
DF(X)iS)  -  iI-XZ*)AS  -  S(Z*AX).  The  standard  Newton  iteration  would  then  be 

X,^,  =  X,-  DF(X,r'F(X,) 

At  first  glance  this  seems  problematic  since  there  are  no  guarantees  that,  first,  the  condition 
Z*Xt-l  is  fulfilled  for  all  iterations  k,  and,  second,  that  DF{X^)  is  nonsingular  for  all  k.  In 
fact,  if  X  is  the  true  invariant  subspace,  then  DF(X)  will  be  singular  if  Z*AX  is  singular,  and 
if  the  spectrum  whose  corresponding  subspace  we  seek  contains  zero,  this  will  be  the  case. 
Chatelin  deals  with  these  two  problems  by  first  showing  that  if  Z*Xq=I,  then  all  subsequent 
iterates  also  satisfy  Z*Xt=I,  and  second  assuming  that  by  shifting  A  to  A  +  ct/  sing^l]arity  of 
Z*AX  can  be  avoided. 

We  show  that  the  same  change  of  basis  we  used  above  leads  to  a  simplified  Newton 
iteration  which  not  only  guarantees  Z*X^=I  but  eliminates  the  artificial  restriction  on  the 
spcctnmi  of  Z*AX.  This  will  also  lead  to  a  new  convergence  criterion  only  slightly  more  res- 
trictive than  (LinearConvergence)  above.  LetZ  =  [/|0]  as  above  and  write 


X,= 


I 


From  this  it  follows  straightforwardly  that  Z*F(X,^)  =  0.   Also  in  this  coordinate  system 


DF(X,)i 


)  = 


-S,(A,,+A:^R,) 


from  which  we  see  that  in  solving 

we  may  take  S=0.  This  corresponds  to  Chatelin's  statement  that  DF{X)~^  maps  the  space  of 
matrices  M  such  that  Y*M  =  Q  into  itself.  Since  R^^y=R^-Z2,  a  little  more  manipulation  leads 
to  the  iteration 

('42:-Mi2)^i+i  -  ^i+iC'^u+'^ii^t)  =  -^21  -  Mi2^i  .  (r^ewton) 

a  set  of  linear  equations  for  ^;+i.  This  is  the  same  iteration  one  gets  applying  Newton  to 
Ai-Ji  -  ^11  ■•■  ^21  ~  RAJl  =  0  directly. 

The  next  theorem  states  a  condition  under  whidi  (Newton)  converges: 
Theorem  2:  Let  k  be  defined  as  in  Theorem  1.  Then  if 

K  <  -— -  ,  (QuadraticConvergence) 

(Newton)  will  converge  quadratically  to  the  unique  solution  R  inside  the  ball  given  in 
Theorem  1.  If  £,■/?,-/?  is  the  error  at  the  i-th  step  of  the  iteration,  then 


Proof:  Take  (Newton),  subtract  from  it  the  same  equation  with  the  solution  R  substituted  for 
R^  and  R^  +  i,  and  rearrange  to  obtain 

implying 

'^**^l^  ^  sep(A,,+A^,^22-Mu)  ^  ^ 

Thus  quadratic  convergence  is  evident,  as  long  as  the  denominator  is  bounded  away  from  0. 
To  prove  this  we  need  bounds  on  the  growth  of  ||/?il|jr. 

From  (Newton)  we  see  that 

,^    I,  ^     l^2illf  -^  \M\f\K\\' 

From  [Stevyarl.]  we  have  the  lemma  that 

sep(A+£^+F)  a  sep(A^)  -  \^\\p  -  \\F\\f  (Lemma) 

which  implies  that 


WRk.iW 


Abbreviating 


we  see  that 


\\A 

hWf  +  \M\f-\K\? 

sep(/li 

1^22)  -  2  llA^I^  \\R,\\, 

ll^.ll^ 

„     /fll^2lllf 

sep(A  11,^22) 

fk.i 

1       'i„/' 

with  /o=0.  It  is  easy  to  see  that  this  recurrence  for /^  increases  monotonically  and  approaches 
(1-(1-12k)^'^)  /  6k  as  a  limit.  Thus 

"^*"^  6k  itpiA,,A22) 

Substituting  this  into  (*)  above  and  using  (Lemma)  we  have 

3  llAnWr 


WEk.iWr  ^  \K\\^ 


sep(An^22)(2+(l-12K)^'2) 
which  concludes  the  proof.  Q.E.D. 

Thus  we  see  that  Stewart's  framework  provides  a  convenient  tool  for  analyzing  conver- 
gence. We  also  see  that  the  two  convergence  criteria  (LinearOjnvergence)  and  (Quadra- 
ticConvergence)  are  almost  the  same,  (QuadraticConvergence)  being  slightly  more  restric- 
tive. 

It  is  possible  to  make  a  matrix  interpretation  of  Newton's  method.  It  is  equivalent  to 
the  following  algorithm: 


Algorithm  1: 

1)  Given  bases  X  and  X'  for  an  approximate  invariant  subspace  and  a  complementary 
space,  transform  the  problem  so  that  [X|X']=/. 

2)  Take  one  step  of  (Iter)  with  Rq=0  yielding  R,  replace  X  by  the  better  estimate  X+X'/J 
and  return  to  step  1). 

In  matrix  notation  this  algorithm  produces  a  sequence  /?^*'  which  determine  a  sequence  of 
similarity  transforms  such  that 


/      0 


/       0 


/     0 


/     0 


-2)/?^''  / 


/"I 


2  /?")  / 


w-i 


converges  to  a  matrix  with  the  lower  left  n-m  by  m  comer  equal  to  0.   2)  ^^'^  **  ^^  *a™c  a* 
the  Ri,  generated  by  Newton. 

We  may  see  this  as  follows.  Obviously  /?^^)  is  the  same  as  the  Ry  produced  by  (Newton). 

k  k 

Now  assume  by  induction  that  2)  ^^'^  =  ^k-   Abbreviate  2  '^^'^  by  R^.   Executing  step  1) 

/-I  /-I 

amounts  by  induction  to  transforming  A  to 


/      0 

-R^  I 


I    0 
R^  I 


^11  ■*"  ^12^  ^12 

A  A^]_^A  Aj^**       ^21        22^         22~"  12 

Executing  step  2)  means  solving 

(A22-/?^/\i.)/?(**^)  -  /?(*+i)(Aii+Ai2^2)  =  ^^Aii+^^A^/J^-Aji-Ajz/?^ 
or 

(A22-/?^-4i:)(/?^+/?^*^^0  -  (/?^+/f(**^')(Aii+Ai2^^)  =  -A2:--R^i4^/?2 


*+i 


for  J?'*"^^).  Comparing  with  (Newton)  shows  i?(*'^^)+/?2  =  2  ^^'^  =  '^i+i  »*  desired. 

/-I 


5.  The  Generalized  Eigenproblem 

The  methods  described  above  generalize  to  the  regular  generalized  eigenproblem 
A-\B.  We  require  regularity  {A-\B  square  and  det(A-XJ)  not  identically  zero)  since  the 
eigenproblem  is  otherwise  not  well-posed.  The  concept  of  invariant  subspace  for  the  standard 
eigenproblem,  a  space  X  satisfying  AX^X,  is  replaceid  by  a  pair  of  deflating  subspaces  X  and 
Y  of  equal  dimension  such  that  AX+flX=Y.  Thus  any  algorithm  is  at  least  conceptually 
going  to  determine  two  spaces  simultaneously.  In  this  section  we  show  how  C,  DMW  and  S 
generalize,  how  they  are  all  equivalent  to  solving  a  generalized  Riccati  equation,  and  how  the 
same  convergence  theory  applies  to  all  three  as  before. 

First  we  show  how  to  generalize  Chatelin's  algorithm  C.  We  will  seek  two  n  by  m 
matrices  X  and  Y  as  bases  for  X  and  Y.  As  before,  we  normalize  X  and  Y  by  another  full 
rank  n  by  m  matrix  Z: 


AX  -  Y(Z*AX)  =  0  Z*X  =  I 

BX  -  Y(Z*BX)  =  0      '      Z*Y  =  I 


(GenC) 


As  before  we  change  bases  (which  means  performing  the  same  equivalence  transformation  on 
A  and  B)  so  that  Z  =  [/|0]''  and  rewrite  X  and  K  as 


X  = 


/ 

R 


■fl- 


As  before  we  call  the  transformed  matrices  A  and  B  and  partition  tLem  as 


^11   ^21 


1 12    f^ll 


rBn  B 


and 


n  ''21 


with  All  ^""^  ^n 


(GenC) 


^12    ^22] 

being  m  by  m  and  A22  and  B22  being  n-mbyn  — m.  IfZ  consists  of  ortho- 
normal  columns,  this  transformation  may  be  taken  to  be  orthonormal.  Plugging  into  (GenC) 
yields 

A22^  ~  ■^^11  ~   ~^21  "^  LA]Ji 

DjjK  ~  '^ u.  ~  ~°2i  "^  LB ]^ 

the  promised  generalized  Riccati  equation. 

There  are  at  least  two  generalizations  of  DMW  to  the  generalized  eigenproblem.  One 
might  try  to  solve  the  equation 

AX  =  BXC 

where  X  is  an  n  by  m  matrix  whose  columns  span  X,  and  C  is  an  m  by  m  matrix.  The  prob- 
lem with  this  generalization  of  DMW  is  Aat  it  assumes  BX  is  of  full  rank,  since  otherwise  C 
might  not  exist.  If  the  pencil  A-\B  has  an  infinite  eigenvalue  whose  deflating  subspace  X  we 
want,  then  BX  will  not  be  of  full  rank  (numerically  it  will  be  nearly  rank  deficient)  and  con- 
sequently C  will  be  very  ill  conditioned  and  hard  to  determine.  If  one  knows  B  is  well  condi- 
tioned, however,  this  method  could  be  used. 

A  better  generalization  of  DMW  is 

AX  =  YAy      ,      BX  =  YBy  (GenDMW) 

where  X  and  Y  are  both  n  by  m  matrices  and  Ay  and  By  are  both  m  by  m  matrices.  As  in 
DMW,  we  will  hold  m  rows  each  of  X  and  Y  constant,  so  that  (GenDMW)  consists  of  2nm 
equations  in  Inrn  unknowns. 

Performing  an  equivalence  transformation  so  that  X  and  Y  are  of  the  form 


X  = 


Y  = 


tl 


and  plugging  into  (GenDMW)  yields 


Aj^2.  "*     "12**    ^  Ay 

^21  +  ^2:^  =  ^y 


B21   t"    °22        ^^   LBy 


Substituting  the  expressions  for  Ay  and  By  into  the  second  equations  yields 

A22R  -  LAii  =  -A21  +  LAJi 
B22R  ~  ^-B II  ~  ~"2\  "^  LB i^R 


(GenDMW) 


the  same  generali.red  Riccati  equation  as  before. 

Stewart  also  deals  with  the  generalized  eigenproblem  in  [Stewart].  Again,  this  work  was 
presented  originally  not  as  an  algorithm  but  as  a  technique  for  doing  perturbation  theory. 
Stewart  takes  n  by  m  matrices  X  and  Y  spanning  approximate  deflating  subspaces  as  well  as  n 
by  H-m  matrices  X'  and  Y'  spanning  complementa]7  subspaces  so  that  [X\X']  and  [Y\Y']  are 
nonsingular.  The  true  deflating  subspaces  are  represented  as  X+X'R  and  Y+Y'L,  where  R 
and  L  are  n-m  by  m  matrices  to  be  determined.  An  equation  for  R  and  L  is  determined  as 
follows.  X+X'R  and  Y  +  Y'L  will  be  deflating  if  and  only  if  the  lower  left  n-m  by  m  corners 


of 


and 


[Y+Y'L\Y']-^  ■  A  ■  [X+n?|X']  =  ([Y\Y']- 


[Y+Y'L\Y']'^  ■  B  •  {X-^X'R\K']  =  ([Y\Y']- 


I  0 
L  I 


1  0 
L  I 


r' ■  A  ■  i[x\x']- 


r' ■  B  ■  i[x\x']- 


I  0 
R  I 


1  0 
R  1 


) 


are  zero.  As  before,  transform  from  the  right  and  left  so  that  [X\X']  and  [Y\Y']  are  identity 
matrices,  if  X  and  Y  consist  of  orthonormal  columns  and  X'  and  Y'  span  the  orthogonal  com- 
plements of  the  spaces  spanned  by  X  and  Y,  respectively,  then  this  change  of  bases  can  be 
orthonorroal.  In  this  new  bases  we  get  that  the  lower  left  comers  of 


/  0 

L  I 

-1 

^21   '^22 

7  0 

R  I 

and 

7  0 
L  I 

-1 

^21   ^22 

7    0 

R    I 

must  be  zero,  or 

A22R  ""  ^^11  ^  ^"21       LiA^^R 
D22"  —  ^-^11  ^  ~°21  "^  LB  ^2R 

the  same  generalized  Riccati  equation  as  before. 


(GenS) 


6.  Solving  the  Generalized  Rlccsti  E^aatioD 

As  for  the  earlier  Riccati  equation,  there  are  two  basic  approaches  to  solving  the  gen- 
eralized Riccati  equation,  the  iteration 

^22^t+i  ~  ^;+i^ii  ~  ~^2i  "*"  ^fin^i 


/?o  =  0 


Lo  =  0 


(Genlter) 


and  Newton's  method.  Also  as  before,  Stewart  provides  a  convergence  criterion  for  (Genlter) 
in  terms  of  easily  computable  properties  of  the  blocks  A,j  and  B,j.  A  similar  analysis  leads  to 
a  slightly  stronger  criterion  for  the  convergence  of  Newton's  method. 

To  state  Stewart's  result  we  need  to  generalize  the  sep  quantity  used  before.  Following 
Stewart  we  define  dif (A, ^yA--'^  11,822)  as  the  smallest  singular  value  of  the  linear  operator 
that  maps  {R,L)  to  {A22R-LA^^,B22R-LB,{).  It  is  nonzero  (i.e.  the  operator  is  invertible)  as 
long  as  the  pencils  A  i—\B  ^  and  ^422— ^^22  h^^*  disjoint  spectra.  One  can  also  show  that  it  is 
the  smallest  singular  vjilue  of  the  2m{n  —  m)  by  2m(n-m)  matrix 

I„®A22   -A[ig7„_„ 


Theorem  3:  [Stewart]  Let 


Then  if 


ll(Al2>gl2)llF  •  H(>^21>g2l)llf 

(^HA^^A22-^n^22) 


'<i 


(GcnLincarConvergence) 
(Genlter)  converges  linearly  to  a  solution  (RX)  which  is  the  unique  solution  inside  the  ball 


Furthermore,  the  cxinvergence  is  linear  with  a  contraction  constant  of  no  more  than 

1  -  (1-4k)^  . 

On  comparing  Theorems  1  and  3,  we  see  they  are  almost  identical,  and  indeed  Stewart 
derives  them  both  as  special  cases  of  a  more  general  theorem  which  says  when  the  iteration 

Tx,^i  =  g  -  ^(xi) 

converges  to  a  solution  of  Tx  =  g  -  4>(x),  Jt  a  member  of  a  Banach  space,  T  an  invertible 
linear  operator,  g  a  constant  member  of  the  Banach  space,  and  <>  a  "quadratic"  operator. 
Also,  the  interpretation  of  k  is  similar:  it  is  small  if  the  initial  approxir-.ste  deflating  sub- 
spaces  are  good  approximations  and  if  tlie  spectrum  associated  with  the  desired  deflating  sub- 
space  is  will  separated  from  its  complement. 

Now  we  turn  to  the  convergence  of  Newton's  method.  The  same  approach  as  before 
>-ields 

('*22-Mu)^i+i  -  '?i+i('^ii+^u'?J  =  -^21  -  Mi2«i        (GenNewton) 

a  set  of  linear  equations  for  (/?i+i,Li+;).    The  next  theorem  states  a  condition  under  which 

(GenNewton)  converges: 

Theorem  4:  Let  k  be  defined  as  in  Theorem  3.  Then  if 

K  <  —  (GenQuadraticConvergence) 

(GenNewton)  will  converge  quadratically  to  the  unique  solution  (/?,/,)  inside  the  ball  given  in 
Theorem  3.  If  f^  -  R^-R  and  Fj  ■  L^-L,  then 

ll(£..:,F,.,)|!,  ^  — l^i^lML  .  1  .  ||(£^^^)|^  . 
dU(i4, 1,^2:^6' 11,^22)       *■ 

Proof:  The  proof  is  analogous  to  the  proof  of  Theorem  3.  Take  (GenNewton),  subtract  the 
same  equation  with  R  substituted  for  ^j  and  /?4+i  and  L  substituted  for  I^  and  I.jt+i,  and  rear- 
range to  obtain 

iA22~^k^n)^k^i  -  Fi+i(Aji4-A^/?i)  =  -F^^E^ 

i^22~^k^i2)^k+i  ~  f^k->-ii^n'^^n^k)  ~  ~^k^n^k 

implying 

\m,:,F,,0\\r^  dif(A,,+A^„A:2-Mi2;flu+fli2/?.^22-Mi2)    '  ^     ^ 

Thus,  quadratic  convergence  is  evident  as  long  as  the  denominator  is  bounded  away  from  0. 
To  prove  this  we  need  bounds  on  the  growth  of  ||(i?i,I.i)|!/r. 

From  (GenNewton)  we  see  that 

|,.^       ,      .„    ^  \\(Ar.^2J\\F  +  \\(A:,^^)\\r\\(R,XM 

III  k^i^k.i)\\F       ail(A,,+A:Ji,A,2-I^kAi2,Bn-^B^,^,2-LkBn)    ' 
From  [Stewart]  we  have  the  lemma  (Genl^mma)  that 
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I 


which  implies  that 

,,.  .,,    ^  ||(/1::^;J|U  4-  ||(A^^^)||;r-||(i?,A)ll> 

Abbreviating 

im.iA.JWf      dif(Aa^22;fiu.B2:) 
we  sec  that 


A 


1  -  2  kA 


with  /o=0.  This  is  the  same  recurrence  as  in  Theorem  3,  and  it  is  easy  to  see  the  rest  of  the 
proof  follows  as  in  the  earlier  theorem.   Q.E.D. 

7.  The  cost  of  Iteration  versos  Newton 

In  this  section  we  discuss  ways  to  solve  the  linear  equations  needed  at  eadi  iteration  of 
the  algorithms,  and  show  that  (Iter)  and  (Genlter)  cost  only  O(n^)  arithmetic  operations  per 
iteration  after  a  preprocessing  cost  of  O(n^),  whereas  (Newton)  and  (GenNewton)  cost  O(n^) 
operations  per  iteration. 

Let  us  consider  the  standard  eigenproblem  Hrst.  Iteration  requires  the  solution  of  the 
set  of  linear  equations 

for  /?,+i  at  each  iteration.  This  equation,  called  Sylvester's  equation,  has  had  solutions  pro- 
posed in  both  [Bartels,  Stew  art]  and  [Golub,Nash,Van  Loan].  Both  algorithms  have  the  pro- 
perty that  after  a  preprocessing  step  (reducing  An  to  triangular  or  Hessenberg  form)  whidi 
takes  0(m^+(n-m)^)  operations,  it  is  possible  to  solve  (Iter)  for  R,+i  in  just 
0((n-m)m-  +  m(n  — m)*)  operations.  This  savings  is  possible  because  the  preprocessing 
reduces  (Iter)  to  a  logically  trianguJar  system.  If  m«n,  a  common  case,  preprocessing  is 
Oin^)  and  solving  is  O(n^).  Thus  (Iter)  is  cheap  after  first  preprocessing. 

Newton,  on  the  other  hand,  requires  the  solution  of 

at  each  step.  This  is  also  a  Sylvester  equation  for  /?,+i,  but  now  the  coefficient  matrices 
Aii-RfA^  and  A^i-¥Ai2Ri  vary  from  step  to  step,  making  it  necessary  to  preprocess  at  each 
step  or  at  least  frequently.  Thus  each  step  could  take  0(n-')  operations  instead  of  O(n^).  This 
fact  seems  to  make  a  modified  Newton  method  in  which  the  coefficient  matrices  are  updated 
either  only  approximately  or  only  occasionally  attractive.  In  particular,  it  seems  (Newton) 
could  be  solved  iteratively  for  i?,+i  using  a  quick  solver  of  A2^-XAii=Y  as  an  approximate 
invene  for  iterative  refinement  This  would  be  especially  true  if  the  matrix  were  nearly  block 
triangular  so  that  R  were  small;  then  the  operator  A2^-XA^i  would  be  a  good  approximation 
to  {A22-RA]2)^~^i^ii^^]J^i)-  However,  as  we  now  show,  we  should  not  expect  this 
scheme  to  be  superior  to  using  (Iter)  for  a  while  to  get  an  approximate  R,  updating  the 
matrix  by  transforming  so  that  [X-irX'R^']  is  the  identity,  and  using  (Iter)  again.  Letting  5, 
denote  the  jth  iterate  in  the  following  iterative  refinement  algorithm  for  (Newton): 
Algorithm  2: 


U 


1)  Compute  the  residual  r,  ™  - A2i-RiAn^i~  i^22~^  1^12)^1'*' ^t(.^n'^^u^ i)- 

2)  Solve  Aj^Jj-d^ii  =  r,  for  the  correction  J,. 

3)  Update  S,+i=S,+ J,. 
After  some  manipulation  we  see 

^iiSi^-St^An  =  -A2i+Mi2'?/+Mu(^/-''/)+(^/-^/M:2^/  (ModNewton) 
Since  we  expect  5,  to  be  a  better  approximation  to  R  than  R„  we  could  let  /?,=5,  in  the  above 
formula.  But  then  (ModNewton)  would  be  identical  to  (Iter).  This  leads  us  to  recommend  the 
following  version  of  the  algorithm  given  earlier  (as  a  matrix  interpretation  of  Newton): 

Algorithm  3: 

1)  Given  bases  X  and  X'  for  an  approximate  invariant  subspace  and  a  complementary 
space,  transform  the  problem  the  problem  so  that  [XlX']=/. 

2)  Take  one  or  more  steps  of  (Iter)  with  Rq=0,  replace  X  by  the  better  estimate  X+X'R 
and  return  to  step  1). 

The  number  of  steps  of  (Iter)  to  take  in  step  2  above  would  depend  on  the  convergence  rate: 
if  it  is  fast  enough,  there  is  no  reason  to  return  to  step  1  and  pay  O(n')  operations. 

The  same  considerations  apply  to  the  generalized  eigenproblem.  It  turns  out  that  if  the 
All  and  fl„  are  triangular,  the  linear  system 

^22Rl  +  i  ~  ^/  +  1^11  ~  ^li^lfRl) 

^t2Ri+i  ~  ^i+i^n  ~  ^2(^1*^1) 
is  logically  block  triangular  with  2  by  2  blocks,  so  that  we  can  solve  for  the  entries  of  L,+i 
and  /?,+!  two  at  a  time  by  substitution.  (Analogs  of  both  the  [Bartels,  Stewart]  and 
[Golub,Nash,Van  Loan])  algorithms  are  possible.)  Thus  again  O(r')  operations  of  prepro- 
cessing allow  each  iteration  of  (Genlter)  to  cost  only  O(n^)  iterations.  Eadi  iteration  of 
(GenNewton),  on  the  other  hand,  costs  O(n')  as  above,  leading  us  to  recommend  an  algo- 
rithm like  Algorithm  3  above. 

It  is  of  interest  to  note  the  work  on  solving  the  Riccati  equation  in  the  control  systems 
community  ([Kleinman],[Amold,Laub]).  They  are  interested  in  solving  variations  of  the  Ric- 
cati equation  (S),  in  particular  when  Ai^=-Al2-  Their  standard  approach  is  to  transform  to 
the  corresponding  matrix  eigenvalue  problem  and  use  the  QR  algorithm.  If  more  accuracy  is 
needed,  they  use  Newton.  These  algorithms  have  been  implemented  in  a  package  of  FOR- 
TRAN subroutine  called  RICPACK.  Currently  none  of  these  algorithms  apply  to  the  general 
nonsymmetric  version  of  the  Riccati  equation  we  consider  in  this  paper,  just  the  special  form 
mentioned  above. 

8.  Fnture  Work 

The  approach  taken  in  this  paper  does  not  tell  us  which  of  the  algorithms  presented  has 
better  numerical  properties.  In  a  future  paper  we  intend  to  make  numerical  experiments  com- 
paring the  speeds  and  accuracies  of  the  algorithm  of  Dongarra,  Moler  and  Wilkinson,  the 
algorithm  of  Chatelin,  and  Alj^orithm  3  presented  in  the  last  section.  In  addition,  the  ver- 
sions of  these  algorithms  for  the  generalied  eigenproblem  will  be  programmed  and  com- 
pared. 
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